NSF PAR Search | NSF Public Access Repository

On Speeding Up Language Model Evaluation

Zhou, Jin Peng; Belardi, Christian K; Wu, Ruihan; Zhang, Travis; Gomes, Carla P; Sun, Wen; Weinberger, Kilian Q (June 2025, International Conference on Learning Representations)

Developing prompt-based methods with Large Language Models (LLMs) requires making numerous decisions, which give rise to a combinatorial search problem over hyper-parameters. Exhaustively evaluating this space can be time-consuming and costly. In this paper, we propose an adaptive approach to explore it. We exploit the fact that often only a few samples are needed to identify clearly superior or inferior settings, and that many evaluation tests are highly correlated. We lean on multi-armed bandits to sequentially identify the next (method, validation sample) pair to evaluate, and use low-rank matrix factorization to fill in missing evaluations. We carefully assess the efficacy of our approach on several competitive benchmark problems and show that it can identify the top-performing method using only 5-15% of the typical resources, resulting in 85-95% LLM cost savings. Our code is available at https://github.com/kilian-group/banditeval.
Free, publicly-accessible full text available June 11, 2026
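
The abstract above combines two ideas: a bandit rule that chooses which (method, validation sample) pair to score next, and a low-rank model that estimates the scores never computed. The following is a minimal sketch of that combination, not the authors' implementation (see the linked repository for that). It assumes a UCB-style selection rule and a simple iterated truncated-SVD imputation; the function names `low_rank_fill`, `ucb_pick_method`, and `adaptive_eval`, and the parameters `rank`, `iters`, and `c`, are hypothetical. A lookup into `true_scores` stands in for one LLM evaluation call.

```python
import numpy as np

def low_rank_fill(scores, observed, rank=1, iters=50):
    """Estimate unobserved entries by iterating a truncated SVD.

    Observed entries are always kept at their measured values.
    """
    filled = np.where(observed, scores, scores[observed].mean())
    for _ in range(iters):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
        filled = np.where(observed, scores, approx)
    return filled

def ucb_pick_method(scores, observed, t, c=1.0):
    """Return the method (row) with the highest upper confidence bound."""
    counts = observed.sum(axis=1)
    safe_counts = np.maximum(counts, 1)
    means = np.where(observed, scores, 0.0).sum(axis=1) / safe_counts
    bonus = c * np.sqrt(np.log(max(t, 2)) / safe_counts)
    # Methods with no observations yet are tried first.
    ucb = np.where(counts == 0, np.inf, means + bonus)
    return int(np.argmax(ucb))

def adaptive_eval(true_scores, budget, seed=0):
    """Spend `budget` evaluations, then return (best method, evaluations used)."""
    rng = np.random.default_rng(seed)
    observed = np.zeros(true_scores.shape, dtype=bool)
    scores = np.zeros(true_scores.shape, dtype=float)
    for t in range(1, budget + 1):
        m = ucb_pick_method(scores, observed, t)
        unseen = np.flatnonzero(~observed[m])
        if unseen.size == 0:
            # Chosen method is fully evaluated; fall back to any other one.
            remaining = np.flatnonzero((~observed).any(axis=1))
            if remaining.size == 0:
                break
            m = int(remaining[0])
            unseen = np.flatnonzero(~observed[m])
        j = int(rng.choice(unseen))
        scores[m, j] = true_scores[m, j]  # stands in for one LLM call
        observed[m, j] = True
    filled = low_rank_fill(scores, observed)
    return int(np.argmax(filled.mean(axis=1))), int(observed.sum())

# Toy example: 3 methods, 8 validation samples, rank-1 score structure.
quality = np.array([0.2, 0.5, 0.9])
difficulty = np.linspace(0.5, 1.0, 8)
truth = np.outer(quality, difficulty)
best, used = adaptive_eval(truth, budget=10)
print(f"picked method {best} after {used} of {truth.size} evaluations")
```

In this toy setup the budget covers fewer than half of the 24 possible evaluations. How reliably such a rule finds the best method depends on the noise, the rank assumption, and the exploration constant, none of which are taken from the paper.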
Correlator convolutional neural networks as an interpretable architecture for image-like quantum matter data

https://doi.org/10.1038/s41467-021-23952-w

Miles, Cole; Bohrdt, Annabelle; Wu, Ruihan; Chiu, Christie; Xu, Muqing; Ji, Geoffrey; Greiner, Markus; Weinberger, Kilian Q.; Demler, Eugene; Kim, Eun-Ah (December 2021, Nature Communications)
Image-like data from quantum systems promises to offer greater insight into the physics of correlated quantum matter. However, the traditional framework of condensed matter physics lacks principled approaches for analyzing such data. Machine learning models are a powerful theoretical tool for analyzing image-like data, including many-body snapshots from quantum simulators. Recently, they have successfully distinguished between simulated snapshots that are indistinguishable from one- and two-point correlation functions. Thus far, the complexity of these models has inhibited new physical insights from such approaches. Here, we develop a set of nonlinearities for use in a neural network architecture that discovers features in the data which are directly interpretable in terms of physical observables. Applied to simulated snapshots produced by two candidate theories approximating the doped Fermi-Hubbard model, we uncover that the key distinguishing features are fourth-order spin-charge correlators. Our approach lends itself well to the construction of simple, versatile, end-to-end interpretable architectures, thus paving the way for new physical insights from machine learning studies of experimental and numerical data.
Full Text Available
Product Kernel Interpolation for Scalable Gaussian Processes

Gardner, J; Pleiss, G; Wu, Ruihan; Weinberger, K; Wilson, A (April 2018, AISTATS 2018)

Recent work shows that inference for Gaussian processes can be performed efficiently using iterative methods that rely only on matrix-vector multiplications (MVMs). Structured Kernel Interpolation (SKI) exploits these techniques by deriving approximate kernels with very fast MVMs. Unfortunately, such strategies suffer badly from the curse of dimensionality. We develop a new technique for MVM-based learning that exploits product kernel structure. We demonstrate that this technique is broadly applicable, yielding runtime for SKI that scales linearly rather than exponentially with dimension, as well as state-of-the-art asymptotic complexity for multi-task GPs.
Full Text Available
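
To illustrate why product kernel structure helps MVM-based inference: a product kernel over input dimensions gives a Gram matrix that is the elementwise (Hadamard) product of per-dimension Gram matrices. If one factor is written as A = U Uᵀ, then (A ∘ B) v = Σₖ uₖ ∘ (B (uₖ ∘ v)), so the product can be applied to a vector using only MVMs with B. The sketch below checks this identity numerically. It is an illustration under stated assumptions, not the paper's algorithm: the helper names `rbf_gram`, `low_rank_factor`, and `hadamard_mvm` are hypothetical, the factor comes from an eigendecomposition, and B is a dense matrix standing in for any fast MVM routine.

```python
import numpy as np

def rbf_gram(x, lengthscale=1.0):
    """Gram matrix of a 1-D RBF kernel on the points in x."""
    diff = x[:, None] - x[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def low_rank_factor(gram, rank):
    """Return U with U @ U.T approximating gram (top-`rank` eigenpairs)."""
    w, v = np.linalg.eigh(gram)  # eigenvalues in ascending order
    w_top = np.clip(w[-rank:], 0.0, None)
    return v[:, -rank:] * np.sqrt(w_top)

def hadamard_mvm(factor_a, mvm_b, v):
    """Apply ((U U^T) * B) to v using only MVMs with B (* is elementwise)."""
    out = np.zeros(v.shape, dtype=float)
    for u in factor_a.T:
        out += u * mvm_b(u * v)
    return out

# Toy check on 2-D inputs with a product of two 1-D RBF kernels.
rng = np.random.default_rng(0)
points = rng.normal(size=(6, 2))
k1 = rbf_gram(points[:, 0])
k2 = rbf_gram(points[:, 1])
vec = rng.normal(size=6)

fast = hadamard_mvm(low_rank_factor(k1, rank=6), lambda z: k2 @ z, vec)
dense = (k1 * k2) @ vec
print(np.allclose(fast, dense, atol=1e-8))
```

With a factor of rank r, the cost is r MVMs with B. A rank below n makes the result approximate, so the choice of rank trades accuracy for speed.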